Unlabeled Data and Multiple Views

Author

  • Zhi-Hua Zhou
Abstract

In many real-world applications, unlabeled data are abundant but the number of labeled training examples is often limited, since labeling the data requires extensive human effort and expertise. Thus, exploiting unlabeled data to help improve learning performance has attracted significant attention. Major techniques for this purpose include semi-supervised learning and active learning. These techniques were initially developed for data with a single view, that is, a single feature set, while recent studies have shown that for multi-view data, semi-supervised learning and active learning can work amazingly well. This article briefly reviews some recent advances in this thread of research.

Similar resources

Detecting Changes in Unlabeled Data Streams Using Martingale

The martingale framework for detecting changes in data stream, currently only applicable to labeled data, is extended here to unlabeled data using clustering concept. The one-pass incremental change-detection algorithm (i) does not require a sliding window on the data stream, (ii) does not require monitoring the performance of the clustering algorithm as data points are streaming, and (iii) work...

Robust Multi-View Boosting with Priors

Many learning tasks for computer vision problems can be described by multiple views or multiple features. These views can be exploited in order to learn from unlabeled data, a.k.a. “multi-view learning”. In these methods, usually the classifiers iteratively label each other a subset of the unlabeled data and ignore the rest. In this work, we propose a new multi-view boosting algorithm that, unl...

Learning with Weak Views Based on Dependence Maximization Dimensionality Reduction

A large number of applications involving multiple views of data are coming into use, e.g., reporting news on the Internet by both text and video, identifying a person by both fingerprints and face images, etc. Meanwhile, labeling these data is expensive, and thus most data are left unlabeled in many applications. Co-training can exploit the information of unlabeled data in multi-view sc...

Multi-view based unlabeled data selection using feature transformation methods for semiboost learning

SemiBoost [23] is a boosting framework for semi-supervised learning, in which unlabeled data as well as labeled data both contribute to learning. Various strategies have been proposed in the literature to perform the task of selecting useful unlabeled data in SemiBoost. Recently, a multi-view based strategy was proposed in [20], in which the feature set of the data is decomposed into subsets (i...

A Co-training based Framework for Writer Identification in Offline Handwriting

Traditional forensic document analysis methods have focused on feature-classification paradigm where a machine learning based classifier is used to learn discrimination among multiple writers. However, usage of such techniques is restricted to availability of a large labeled dataset which is not always feasible. In this paper, we propose a Co-training based approach that overcomes this limitatio...

Publication date: 2011